Loading libraries

Wrangling

Which genres have the most joyful music?

Do sadder songs have more words?

Linear Model

Are certain genres sadder than others? Can we see this with regression?

There does not appear to be enough multicollinearity among the predictors to violate the usual linear model assumptions.
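One common way to quantify multicollinearity is the variance inflation factor (VIF), which for standardized predictors equals the diagonal of the inverse correlation matrix. The block below is a minimal Python sketch of that calculation, not the check actually run for this report. The data are synthetic and hypothetical: `energy`, `loudness`, and `tempo` only borrow the predictor names, and `loudness` is deliberately built to correlate with `energy`.

```python
import random
from statistics import mean, pstdev


def correlation_matrix(columns):
    """Pearson correlation matrix for a list of equal-length numeric columns."""
    n = len(columns[0])
    standardized = []
    for col in columns:
        mu, sigma = mean(col), pstdev(col)
        standardized.append([(x - mu) / sigma for x in col])
    return [[sum(a * b for a, b in zip(ci, cj)) / n for cj in standardized]
            for ci in standardized]


def invert(matrix):
    """Invert a square matrix by Gauss-Jordan elimination with partial pivoting."""
    size = len(matrix)
    # Augment each row with the matching row of the identity matrix.
    aug = [list(row) + [float(i == j) for j in range(size)]
           for i, row in enumerate(matrix)]
    for col in range(size):
        pivot = max(range(col, size), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        scale = aug[col][col]
        aug[col] = [v / scale for v in aug[col]]
        for r in range(size):
            if r != col:
                factor = aug[r][col]
                aug[r] = [v - factor * p for v, p in zip(aug[r], aug[col])]
    return [row[size:] for row in aug]


def vif(columns):
    """VIF of each predictor: the diagonal of the inverse correlation matrix."""
    inverse = invert(correlation_matrix(columns))
    return [inverse[i][i] for i in range(len(columns))]


# Hypothetical data, NOT the song data used in this report.
rng = random.Random(0)
energy = [rng.random() for _ in range(500)]
loudness = [10 * e - 12 + rng.gauss(0, 1) for e in energy]  # tied to energy
tempo = [rng.uniform(60, 180) for _ in range(500)]          # independent

vifs = dict(zip(["energy", "loudness", "tempo"], vif([energy, loudness, tempo])))
for name, value in vifs.items():
    print(f"{name}: VIF = {value:.2f}")
```

A VIF near 1 means a predictor is nearly uncorrelated with the others. Values above roughly 5 to 10 are a common rule-of-thumb warning sign, although the exact cutoff is a judgment call.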


Call:
lm(formula = .outcome ~ ., data = dat)

Residuals:
     Min       1Q   Median       3Q      Max 
-0.57938 -0.12137 -0.01451  0.11287  0.61145 

Coefficients:
                                 Estimate Std. Error t value Pr(>|t|)    
(Intercept)                    -0.3179539  0.0899796  -3.534 0.000425 ***
danceability                    0.4978225  0.0322873  15.419  < 2e-16 ***
energy                          0.6104190  0.1056897   5.776 9.66e-09 ***
loudness                       -0.0106577  0.0082277  -1.295 0.195441    
acousticness                    0.2334146  0.1131236   2.063 0.039284 *  
speechiness                     0.1772236  0.0455258   3.893 0.000104 ***
tempo                           0.0001279  0.0001757   0.728 0.466931    
instrumentalness               -0.0488339  0.0204058  -2.393 0.016851 *  
liveness                        0.0200725  0.0354814   0.566 0.571685    
key                             0.0013089  0.0013788   0.949 0.342639    
`energy:loudness`               0.0174139  0.0122201   1.425 0.154398    
`energy:acousticness`          -0.0961812  0.1766591  -0.544 0.586232    
`loudness:acousticness`         0.0080406  0.0096553   0.833 0.405137    
`energy:loudness:acousticness`  0.0048689  0.0161302   0.302 0.762814    
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 0.1835 on 1254 degrees of freedom
Multiple R-squared:  0.3451,    Adjusted R-squared:  0.3383 
F-statistic: 50.83 on 13 and 1254 DF,  p-value: < 2.2e-16

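It looks like danceability, energy, speechiness, acousticness, and instrumentalness are significant predictors of valence (the joyfulness/sadness of a song) at the 0.05 level. Since the coefficient of instrumentalness is negative, a song with more instrumentalness is typically a sadder song according to the model, holding the other predictors fixed. Because energy and acousticness also appear in interaction terms, their main-effect coefficients should be read with those interactions in mind. Unfortunately, these predictors explain only about 34% of the variance in valence (R-squared 0.345, adjusted R-squared 0.338).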
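As a sanity check on how to read the summary, the headline statistics can be recomputed from the numbers printed in the regression output. This is a small Python sketch using the standard formulas for adjusted R-squared, the overall F-statistic, and a coefficient's t value; the variable names are illustrative, and every input is copied from the output above.

```python
# Values copied from the regression summary.
r_squared = 0.3451
n_terms = 13        # model terms, excluding the intercept
residual_df = 1254  # residual degrees of freedom
n_obs = residual_df + n_terms + 1  # observations used in the fit

# Adjusted R-squared penalizes R-squared for the number of terms.
adjusted_r_squared = 1 - (1 - r_squared) * (n_obs - 1) / residual_df

# Overall F-statistic: explained variance per term over residual variance.
f_statistic = (r_squared / n_terms) / ((1 - r_squared) / residual_df)

# A coefficient's t value is its estimate divided by its standard error.
t_danceability = 0.4978225 / 0.0322873

print(round(adjusted_r_squared, 4))
print(round(f_statistic, 2))
print(round(t_danceability, 3))
```

Up to rounding of the printed inputs, these reproduce the reported adjusted R-squared (0.3383), F-statistic (50.83), and danceability t value (15.419).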

What is the time trend in songs?

What about the time trend in albums?

Are songs actually becoming less vocal-heavy over time?

Are albums becoming less vocal-heavy over time?

The black line is a LOESS fit, and the dark red line is a linear model fit included to show the general trend.
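To illustrate the difference between the two lines, here is a minimal Python sketch that fits both a straight line and a simplified LOESS-style smoother (local linear fits with tricube weights, no robustness iterations) to the same points. It is an assumption-laden illustration and not necessarily the smoother settings behind the plots: the `span` of 0.5, the helper names, and the synthetic `years` and `values` series are all hypothetical.

```python
import math
import random


def weighted_line(xs, ys, ws):
    """Weighted least-squares line; returns (intercept, slope)."""
    total = sum(ws)
    x_bar = sum(w * x for w, x in zip(ws, xs)) / total
    y_bar = sum(w * y for w, y in zip(ws, ys)) / total
    sxx = sum(w * (x - x_bar) ** 2 for w, x in zip(ws, xs))
    sxy = sum(w * (x - x_bar) * (y - y_bar) for w, x, y in zip(ws, xs, ys))
    slope = sxy / sxx if sxx > 0 else 0.0
    return y_bar - slope * x_bar, slope


def loess(xs, ys, span=0.5):
    """Simplified LOESS: a local linear fit at each x with tricube weights."""
    k = max(2, math.ceil(span * len(xs)))
    fitted = []
    for x0 in xs:
        dists = [abs(x - x0) for x in xs]
        # Bandwidth is the distance to the k-th nearest point (guarded against 0).
        bandwidth = max(sorted(dists)[k - 1], 1e-12)
        ws = [(1 - (d / bandwidth) ** 3) ** 3 if d < bandwidth else 0.0
              for d in dists]
        intercept, slope = weighted_line(xs, ys, ws)
        fitted.append(intercept + slope * x0)
    return fitted


# Hypothetical curved trend with noise, NOT the song data used in this report.
rng = random.Random(1)
years = list(range(1960, 2020))
values = [0.6 - 0.0002 * (y - 1990) ** 2 + rng.gauss(0, 0.03) for y in years]

# Global straight line: the same fit with equal weights on every point.
intercept, slope = weighted_line(years, values, [1.0] * len(years))
linear = [intercept + slope * y for y in years]

# Local smoother: follows the curvature that the straight line averages away.
smooth = loess(years, values, span=0.5)


def rss(fit):
    """Residual sum of squares of a fitted series against the observed values."""
    return sum((v - f) ** 2 for v, f in zip(values, fit))


print(f"linear RSS: {rss(linear):.4f}")
print(f"loess RSS:  {rss(smooth):.4f}")
```

The straight line summarizes the overall direction with one slope, while the local smoother can bend where the trend changes. A smaller `span` follows the data more closely at the cost of a wigglier curve.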

Hierarchical Clustering of Genres